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Abstract 

An intermediate complexity baroclinic model for the atmospheric jet at middle-latitudes is used 
as a stochastic generator of earth-like time series: in the present case the total energy of the system. 
Statistical inference of extreme values is applied to yearly maxima sequences of the time series, 
in the rigorous setting provided by extreme value theory. In particular, the Generalized Extreme 
Value (GEV) family of distributions is used here as a fundamental model for its simplicity and 
generality. Several physically realistic values of the parameter Te, descriptive of the forced equator- 
to-pole temperature gradient and responsible for setting the average baroclinicity in the atmospheric 
model, are examined. Stationary time series of the total energy are generated and the estimates 
of the three GEV parameters - location, scale and shape - are inferred by maximum likelihood 
methods. Standard statistical diagnostics, such as return level and quantile-quantile plots, are 
systematically applied to asses goodness-of-fit. The location and scale GEV parameters are found 
to have a piecewise smooth, monotonically increasing dependence on Te- This is in agreement with 
the similar dependence on Te observed in the same system when other dynamically and physically 
relevant observables are considered. The shape parameter also increases with Te but is always 
negative, as a priori required by the boundedness of the total energy of the system. The sensitivity 
of the statistical inference process is studied with respect to the selection procedure of the maxima: 
the roles of both the length of maxima sequences and of the length of data blocks over which the 
maxima are computed are critically analyzed. Issues related to model sensitivity are also explored 
by varying the resolution of the system. 

PACS numbers: 02.50.Tt, 02.70.-c, 47.11.-j, 92.60.Bh, 92.70.Gt 
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I. INTRODUCTION 



The study of climatic extreme events is of paramount importance for society, particularly 
in the fields of engineering and environmental and territorial planning. Indeed, temporal 
variations in the statistics of extreme events may have more acute and disruptive effects than 
changes in the mean climate j^J. In works of economical nature (see e.g. Nordhaus [3]), the 
special role played by the extreme events in terms of impacts is included with the hypothesis 
that the costs associated with climatic change can be represented as strongly nonlinear func- 
tions of the observed variations in surface temperature. This constitutes a clear motivation 
for which, when the impacts of climatic change are analyzed, the interest for variations in 
the statistics of extreme events plays a strategic role jisl. I^. 

In the scientific literature, some recent papers in which the existence of trends in the 
frequency of extreme (precipitation) events was pushed forward in quantitative terms are 
those by Karl et al. [22, 2^. Here the authors stated that the percentage of the U.S.A. with a 
much above normal proportion of total annual precipitation from extreme precipitation events 
(daily events at or above 2 inches) showed an increase from 9% in 1910-1920 to about 11% 
in the '90s. Despite severe scientific criticism to these papers by many other researchers 
in the field, the basic idea that the frequency of extreme events may change together with 
average surface temperature was discussed more and more and, eventually, it became one 
of the issues of analysis for the Intergovernmental Panel for Climate Change: a specific a 
specific report on Changes in extreme weather and climate events was issued in 2002 [lfil |. 
Basic questions, when dealing with extremes of complex processes, is: what is the correct way 
of measuring extremes? Are we concentrating on local or global fluctuations of the system in 
question? How do we measure local extremes? Extremes of wind speeds, of rainfall amounts, 
of economical damage? Moreover, the enhancement in the extreme events might be quantified 
either in terms of number of events, or in size of the average extreme event, or a combination 
thereof. Several other ambiguities make it often difficult to follow literature on the subject. 

Overall, two important weaknesses of much work on the subject of extreme meteo-climatic 
events and of their trends are: 

• the lack of interpretation of the dynamical mechanisms that should cause the hypoth- 
esized changes in the frequency of extremes of various nature; often such mechanisms 
are just alluded to instead of being explicitely formulated and analyzed; 

• the lack of a common and theoretically founded definition of "extremes" . 

The deficit in the first point above may negatively affect both deterministic and statistical 



studies of the phenomena in question. One major example on global processes is that, despite 
the great attention attracted by the subject, very few researchers have investigated in detail 
the basic mechanisms that should associate an increased CO2 concentration to enhanced 
extreme weather events. The chain of mechanisms possibly linking CO2 concentration and 
weather extremes is too long even for an adequate qualitative discussion here, but we shall 
concentrate on the basic sequence: enhanced surface temperature — > enhanced baroclinic- 
ity — > changes in the upper tail of the probability distribution function of the baroclinic 
disturbances. But no robust analysis of this complex dynamical "chain" has been offered so 
far. 

As for the second point above, the lack of a common rigorous framework for the statis- 
tical anaiysis of extremes (with few exception, S ueh as ,,UUU) P-ides a serious 
drawback for the interpretation and comparison of results from different studies. Moreover, 
this problem is not even justified, since mathematical theories of extreme events are well- 
developed O, l a, la, l9l. llfll I 111 l29j] and the derived methods are quite successful in many appli- 




cations |25l. 1401. I49L l50| . One basic ingredient of the theory relies on Gnedenko's theorem [11 1. 
which states that, under fairly mild assumptions, the distribution of the block-maxima of a 
sample of independent identically distributed variables converges to a family of three distinct 
distributions, the so-called Generalized Extreme Value (GEV) distributions. See Appendix 1X1 
for a brief description. Notice that one of the earliest applications of this theory in the natural 
sciences occurred specifically in a meteorologic-climatic setting [^J . Other statistical models 
for extreme events include the r-largest statistics, threshold exceedance models such as the 
generalized Pareto distribution, and point processes, see j^|. 

The reliability of parametric estimates for extreme value models strongly depends on the 
asymptotic nature of extreme value theory. In particular, at least the following issues should 
be checked or addressed jsj]: 

1. independence of the selected extreme values; 

2. using a sufficiently large number of extremes; 

3. using values that are genuinely extreme. 

Despite the importance of the third requirement, many studies actually deal with so-called 
soft extremes j26[ , which are maxima of too short data blocks or with too small return periods 
for the basic assumptions of the theory to hold. This is often the consequence of the limited 
amount of available data: on one hand, one has to restrict to maxima of data blocks, thereby 
discarding most available data; on the other hand one would like to have a long sequence of 



extreme values. The net result is that the assumptions of the extreme value theorems often go 
unchecked and are sometimes plainly impossible to check, since the available climatic records 
cover at best the last century. Therefore, thinking in terms of annual maxima, in such cases 
we only have 100 extremes. The inevitable consequence of adapting the definition of extremes 
to the needs of the work is a serious reduction of reliability of the resulting estimates. 

The goal of this paper is to infer and critically quality-check the statistical description of 
extreme values in the GEV distributions framework on the "earth-like" time series produced 
by a dynamical system descriptive of the mid-latitude atmospheric circulation featuring a 
chaotic regime. Such system has internally generated noise and can be effectively considered 
as a stochastic generator of data. Time series of the system's total energy E(t) are used, 
which is a relevant global physical quantity. We analyze how the GEV distribution inferred 
from block maxima of E(t) depends on the value of the most important parameter of the 
system, namely the forced equator-to-pole temperature difference Tg, which controls the 
baroclinicity of the model. The reliability of the GEV fits is studied, by considering both 
shorter sequences of extremes and soft extremes. Moreover, issues related to model error and 
sensitivity are briefly examined by analyzing the effects of variations in model resolution. The 
use of numerically generated data allows us to avoid all the difficulties related to shortness 
of the available climatic records, such as missing observations and low-quality data. In 
particular, we do not need to worry about the wastage of data caused by the selection of 
annual maxima, which is a serious limitation when considering observational data. In such 
arethodological sense, oar approach is shailar to that of Q as far as statist aaerer.ee 
is concerned. However, an important difference is that the statistics of the time series E(t) 
generated by the atmospheric model cannot be directly chosen: there is no explicit formula 
relating the probability density function of the adopted observable and the parameter T#. 
This problem we analyze elsewhere j^]. 

The structure of the paper is now outlined. In Sec. |H] we first describe the set-up of 
the numerical experiments performed with the atmospheric model and then the methods of 
statistical analysis of extreme values adopted for the total energy time series. The results for 
the considered reference case of 1000 yearly maxima are presented in Sec. IIHI Assessment 
of the sensitivity of the inferences is studied in Sec. IIV1 by varying the length of yearly 
maxima sequences, the block length over which maxima are taken, and the model resolution. 
The dependence of the GEV parameters with respect to Te is also analyzed in this section. 
Sec. E summarizes the results and their relation with the above discussion. The theory and 
the methods of Extreme Value distributions, as far as needed in the present work, are briefly 



reviewed in Appendix [XJ The model of the baroclinic jet used as a stochastic generator is 
described in Appendix |Bj referring to (^J for a thorough discussion. 



II. DATA AND METHODS 



A. Total Energy of the Atmospheric Model 



We consider a quasi-geostrophic intermediate complexity model [35J, |36 , |46j (also see Ap- 
pendix El), providing a basic representation of the turbulent jet and of the baroclinic conver- 
sion and barotropic stabilization processes which characterize the physics of the mid-latitudes 
atmospheric circulation. The model is relaxed towards a given equator-to-pole temperature 
profile which acts as baroclinic forcing. It features several degrees of freedom in the latitudi- 
nal direction and two layers in the vertical - the minimum for baroclinic conversion to take 



place 



44j . The system's statistical properties radically change when the parameter T E} 
determining the forced equator-to-pole temperature gradient, is changed. In particular, as Te 
increases a transition occurs from a stationary to an earth-like chaotic regime with internally 
generated noise. By chaotic, we mean that the system possesses a strange attractor in phase 
space For a detailed description of the model physics and dynamics see . 

In the present setting, the model is used as a stochastic generator of earth-like time series 
for testing the reliability of different statistical approaches j3] and studying the dynamics 
of extremes jsS]. A uniformly spaced grid of 21 values of the parameter Te is fixed in the 
range [10,50], starting from 10 and increasing with step 2. The baroclinic model is run for 
Te fixed at each of these values, producing 21 simulations of length 1000 years (preceded by 
an initial transient of five years) where the total energy E(t) is written every 6 hours. The 
formula of the total energy is given in Appendix [0 equation (jB15j) . We recall that, in the 
non-dimensionalization of the system, T# = 1 corresponds to 3.5 K, 1 unit of total energy 
corresponds to roughly 5 x 10 17 J, and t = 0.864 is one day, see j^] for details. 

For each of the selected values of Te, a chaotic attractor is numerically detected in the 
phase space of the model. This is illustrated by the autocorrelations of the time series of the 
total energy E(t) (Fig. Q), which decay to zero on a time scale that is comparable with that 
of the atmospheric system (roughly 10-15 days Since all parameters of the model are 

kept fixed in each simulation, by discarding the initial transient, the time series of E(t) may 
be considered a realization of a stationary stochastic process. 

The distribution of the total energy time series is visualized by means of the histograms 
and boxplots in Fig. 121 for three values of Te- Notice that, as Te increases, 

• the upper tail of the distribution becomes heavier, whereas the lower tail shortens; 

• both the average value and the variability of the total energy time series increase. 



The latter point is clearly visualized in Fig. El where the time-averaged total energy is dis- 
played for each of the 21 stationary time series, together with confidence intervals. Through- 
out the paper, confidence intervals are computed as average plus/minus sample standard 
deviation multiplied by 1.96. 

In concluding this section a theoretical remark is in order here. All examined strange 
attractors are implicitly assumed to possess a unique Sinai-Ruelle-Bowen (SRB) ergodic in- 
variant measure This is indeed a rather general and difficult problem in Dynamical 
Systems and Physics: on the one hand, existence of a unique SRB measure is necessary to 
rigorously associate a stationary stochastic process with the dynamical evolution law. On 
the other hand, existence of a unique SRB measure is a very strong regularity assumption 
for a dynamical system: in general it is even the question whether invariant measures exist 
at all and, if so, whether a finite or infinite number of invariant measures coexist for a given 
chaotic system. Moreover, even if an SRB measure exists and is unique, it is in general 
non-parametric: there is no explicit formula relating the statistical behavior to the system's 
equation and parameters. 



B. Parameter Estimation and Model Assessment in GEV Inference 

As discussed in the previous section, the time series we work with are characterized by 
fast decay of autocorrelations, which implies weak (short time-range) dependence of the 
observations, compare Fig. ^ Inference of threshold exceedance models is in this 

case complicated by the choices of suitable threshold values and cluster size for declustering 
(see e.g. 5|> Chap. 5]), which might be somewhat arbitrary in the applications. On the other 
hand, since the dependence is short-range, maxima of the total energy time series, taken 
over sufficiently large data blocks, are with good approximation independent. This is why we 
have preferred the use of the GEV with respect to threshold models. Moreover, since we can 
generate time series of arbitrary length, for simplicity we refrained from using the r-largest 
statistics, which is often valid alternative to the GEV, especially when data scarcity is an 
issue. In this section, therefore, we recall the methods of GEV inference as far as needed in 
the p r e 8en t wo rk . The exposition i 8 large* ba8 ed o n |. A, 8 o 8 ee RflflflflyQ for 
methodology and terminology of extreme value theory. 

Gnedenko's theorem ^lJjOr the three types theorem, first presented in a slightly less general 
form by Fisher and Tippet |9j) states that, under fairly mild assumptions, the distribution of 
the block-maxima of a sample of independent identically distributed variables converges, in 



a suitable limit, to one of three types of extreme value distributions. The three types are in 
fact special cases of the GEV distribution (also called von Mises type), having the following 
expression: 

G(x) = exp 



for x in the set {x : 1 + £(x — /S)/cr > 0} and G(x) = otherwise, with — oo < fi < +oo, 
a > 0, and — oo < £ < +oo. The quantities (/i, a, £) are called location, scale and shape 
parameter, respectively. In such a framework, statistical inference of extreme values amounts 
to estimating the GEV distributional parameters (/i, a, £) for a given time series and assessing 
the quality of the fit. If £ > (£ < 0) the distribution is usually referred to as Frechet 
(Weibull) distribution, if £ = we have the Gumbel distribution, which can be expressed as 
(|A5J) . See Appendix |X] for further theoretical details and [3, 0, 0, [ill 3] for examples and 
discussion. 

In practical application of the extreme value theory the parent distribution function of 
the data is typically unknown. Therefore, both the type of limiting distribution and the 
parameter values must be inferred from the available data and the quality of the resulting 
estimates should always be assessed. For GEV inference, a sequence of maxima is constructed 
by subdividing the available data {xi} into blocks of equal length and extracting the maximum 
from each block. The block length is one of the choices playing the usual, critical role between 
bias and variance in the parametric estimates. Too short blocks increase the length of the 
maxima sequence but, at the same time, they increase the risk of failure of the limit (jA4|) . If 
the blocks are too long, the resulting scarcity of maxima induces an enhanced uncertainty of 
the inferred values of the GEV parameters. In many situations a reasonable (and sometimes 
compulsory) choice is to consider the annual maxima (see P|). 

Assume that the observations in the time series are equispaced in time and that none 
of them is missing (both conditions are often violated in concrete cases, see e.g. j^)- Let 
n be the number of observations in a year and denote by M n l , . . . , M n m the sequence of 
the annual maxima, i.e., the maxima over data blocks of length n. Under the assumption 
of independence of the X n , the variables M n> i, . . . ,M niTn are independent as well. In fact, 
approximate inde pen dence of the M nyi holds also in the case of weak dependent stationary 
sequences, see P, I^J for definitions and examples. 

Among the numerous methods to infer the GEV parameters (graphical or moment-based 
techniques, see j^J), we adopt the maximum likelihood estimator for its great adaptability to 
changes of models. Denote by 6 = (//, a, £) the parameter vector for the GEV density g(x; 0), 
the latter being the derivative of G(x) = G(x; 6) in JIJ). In the stationary context, the block 



maxima of the observed data are assumed to be realizations of a stationary stochastic process 
having density g(x; 0°), where 6° is the unknown parameter vector. The maximum likelihood 
estimator 6° of 0° is defined as the value that maximizes the likelihood function 



L{0)=]\g{M nXi e) 



(2) 



In loose words, maximizing L(0) yields the parameter values for which the probability of 
observing the available data is the highest. It is often more advantageous to maximize the 
log-likelihood function 

m 

1(0) = log L(0) = log g{M n ,i, 0) (3) 



i=l 



and, according to (JTJ),we get 
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defined on the points M n ^ that, in the case £ 7^ 0, satisfy the condition l + £ 



> for 



all z = 1, . . . , m. Indeed, since the logarithm is a monotonic increasing function, the likelihood 
function reaches its maximum value at the same point as the log-likelihood function. 

Approximate confidence intervals for 6° are constructed using_the fact that each component 

of e° = 



3 2 , 0°3) = (yU , cr°, £°) = is asymptotically normal [5] 



6\^N{el^ hi ) Vi = l, 



(5) 



where ipij is a generic element of the inverse of the observed information matrix Io(0) defined 
by 

d 2 l(0)\ 



UO) 



dOidej . . 



= l,...,d 



(6) 



and evaluated in = 8°. From Eq. (j3J) one obtains the (1 — a)-confidence interval for 9°i. 



0? ± z«J$ iti 



(7) 



where z« is the (1 — «/2) quantile of the standard normal distribution. All confidence 
intervals in this paper are computed by formula ((7|), except when a more detailed analysis is 
presented. For example, in the assessment of inference quality, confidence intervals are also 
computed by a standard bootstrap procedure (applied to the sequence of annual maxima) and 



by profile likelihood. The latter technique consists in the following. Consider the parameter 
£, to fix ideas. The profile likelihood of £ is obtained by setting fi and a to their maximum 
likelihood estimates, fi° and er°, respectively, in the log-likelihood function / The plot of 
, fj, , £) as a function of £ is a section of the likelihood surface of (j3J) as viewed from the 
£-axis. Confidence intervals constructed from this graph are often more accurate than those 
obtained by observed information matrix, see ^] for examples. 

One of the main goals of extreme value theory is estimating the probability of occurrence of 
events that are more extreme than those that have been observed thus far. Let z p be the value 
that has a probability p to be exceeded every year by the annual maximum: P{M n ^ > z p } = p 
with < p < 1. In common terminology z p is called the return level associated with the return 
period 1/p. A maximum likelihood estimator for z p is obtained by plugging the estimates 
for 6° = [/I, cr, £] into the quantiles of G(x), obtained by inverting Eq. (£[]). This yields the 
estimator 



l-{-log(l-p)}- 



for f ^0, 

ju — a log {— log(l — p)} for £ = 0. 

The variance of the return level estimator z p is approximated as 

Var(z p ) w Vz T v VVz pi 



(9) 



where VzJ 



dp, ' 9tr' d£ 



V is the variance-covariance matrix: 



Var(p) Cov(fi,a) Cot^/i, 
Cov(a,fi) Var(a) Cov(cr,^) 
[Cov^ri Cov(£,a) Var{i) j 



(10) 



and both Vz p and V are evaluated at the maximum likelihood estimate 0° = [ju, a, £] . This 
allows the construction of confidence intervals for S p and is referred to as the delta method. 
Again, profile likelihood and a boostrap technique are used for goodness-of-fit assessment of 
the return level inferences. 

Only for Weibull distributions (£ < 0) it is possible to have p = 0, corresponding to a 
return level with an infinite return period. In this case, 



z = ji 



a 



'ID 



All information about the return levels is usually reported in the return level plot, where z p 
is plotted against log y p , where y p = — log(l — p) (compare Eq. (jSJ)). The return level plot 
is linear for the Gumbel distribution, concave for £ > (Frechet) and has the horizontal 



asymptote (JTTJ) (Weibull). Notice that the smallest vales of p are usually those of interest, 
since they correspond to very rare (particularly extreme) events. In the return level plots, 
events with a short return period (large probability p) are compressed near the origin of the 
axes, while outliers and rare events (small p) are highlighted. For this reason such plots are 
very useful tools for both model analysis and diagnosis. 

The above procedures for the estimates of return levels and GEV parameters require as- 
sessment with reference to the available data. Useful graphical checks are the probability 
plot, the quantile-quantile plot (QQ plot) and the return level plot. The first is the com- 
parison between the estimated and the empirical distribution function G(x). The latter is a 
stepfunction defined by 

C(A%) = (12) 

where M (i) is the order statistic for the sequenee M„ a , .... M of m b.oek maxima. Notice 
that the definition of the empirical d.f. (|T2j) is not unique, see |3J|. 
The QQ-Plot, formed by the points 

5_1 (^y)^w)- Vt = l,...,m} (13) 

highlights the behavior of the model tail, which is often the most interesting part. Substantial 
departures of the above plots from the diagonal indicate inadequacy of the GEV model or 
other systematic errors. Another diagnostic plot is constructed by adding confidence intervals 
for z p and return levels of the empirical d.f., according to Eq. (fTT?|) . to the return level plot 
(see above). Agreement of the empirical d.f. with the return level curve suggests goodness 
of fit and adequacy of the GEV model. 

All computations and plots in this paper have been made with the software 
R jl^ . available under the GNU license at www.r-project.org. The library ismev 
(www.cran.r-project.org), which is an R-port of the routines written by Stuart Coles 
as complement to has been used with minor modifications. 



III. GEV INFERENCES FOR 1000 ANNUAL MAXIMA 

The annual maxima are extracted from the 6-hourly time series of the energy described 
in Sec. OH Each series contains 4 x 365 x 1000 = 1460000 data. We then set n = 1460 
in (IA1|) . thereby obtaining sequences of 1000 annual extremes of the total energy. The yearly 
maxima are linearly uncorrelated (Fig.EJ), suggesting that it is safe and reasonable to assume 
independence. Also compare with the autocorrelation decay time in Fig. Q 



On theoretical grounds we can at least deduce one constraint on the distribution of ex- 
tremes for the energy time series. Indeed, since the attractor is contained within a bounded 
domain of the phase space and since the energy observable E(t) defined in (jB15j) is a con- 
tinuous function of the phase space variables, it turns out that the total energy is bounded 
on any orbit lying on (or converging to) the attractor. Therefore, the energy extremes are 
necessarily Weibull distributed (£ is negative). This provides a theoretically founded criterion 
for quality assessment of the obtained GEV inferences. 

The GEV parameters (fi, a, £) are estimated by the maximum likelihood method (see 
Sec. Ill B|) from the sequences of yearly maxima. The fitted values of (//, a, £), together with 
confidence bands (computed by the observed information matrix, formula (J7J)) are plotted as 
functions of T# in Fig. The inferred parameters /i and a increase monotonically with T#. 
Estimates of £ are in each case negative and the related confidence intervals are markedly 
bounded away from zero: observed information matrix, profile likelihood and bootstrap yield 
similar estimates. The latter result matches quite well the theoretical expectation discussed 
in the previous section. Also notice that the uncertainty in £ may reach up to 21% of its value, 
whereas the parameters \x and a are quite accurately estimated: the maximal uncertainties 
in fi and a are 0.1% and 2.5% of the corresponding value, respectively. 

Information on the tails of the energy distribution is straightforwardly expressed by the 
return level plots, where z p = G _1 (l — 1/p) is the return level associated to the p-year return 
period and G is the GEV distribution (JI}. In Fig. return levels with return periods of 10, 
100 and 1000 years are plotted as functions of Te- Each graph is monotonically increasing 
with Te and, for Te fixed, the return levels increase with the return period. 

The dependence of the GEV probability density with respect to Te is illustrated in Fig. [7| 
The increase of scale and location parameters with Te induces a rightward shift and a spread 
of the probability density. In particular, from the geophysical point of view, both the range 
and severity of possible extreme values of the total energy increase with Te- In fact, this 
behavior sets in for Te right after the creation of the chaotic attractor, see Fig. right. 

A. Smoothness of GEV Inferences with Respect to System Parameters 

The dependence from Te of the time-averaged total energy and of the inferred GEV 
parameters (including the return levels) is rather smooth, see Fig. Eland Fig. El This strongly 
suggests the existence of functional relations of the form 

= aTg and a = ciT^ . (14) 



Such power laws are fitted to the graphs of fi and a as follows. 

To set ideas, we consider /i and denote by fi(T E ) and <r^(Tg) the maximum likelihood 
estimate of fi and the related standard deviation (calculated by the observed information 
matrix), respectively, where T 3 E is one of the 21 chosen values in the interval [10,50]. A 
bootstrap procedure is performed where iterated realizations of a sequence of 21 independent 
Gaussian variables with mean £l(T e ) and standard deviation ap,(T E ) are simulated. For each 
realization, a power law fit as in (JTljl is performed. The sample average and standard deviation 
of the so obtained fits, constructed independently for \i and a, are reported in Tab. |U and 
Tab. HQ 

Two distinct ranges of Te are identified, where /x scales by a different exponent, also see 
Fie. Elleft. For Te < 18 7,, is ~ 1.73 while it decreases to ~ 1.6 for Te > 18. The time-mean 

n 

total energy of the system has a rather similar power- law dependence on Te |35J. In the 
upper Tg-range the exponent of the power law of the extremes is larger than that of the 
time-mean total energy (~ 1.52), which implies that asymptotically the extremes tend to 
become relatively more extreme. When considering a, there is an initial interval of Te where 
no power law is obeyed, see Fig. IHlleft. For 22 > Te > 15 7 CT is ~ 3.0 while it decreases to 
~ 2.1 for Te > 22. Since 7 CT > 7 M for high values of Te, we have that asymptotically with Te 
the spread of the maxima tends to become consistent with respect to their average location, 
thus suggesting a larger variability in the maxima. Shorter yearly maxima sequences, of 
length 300 and 100, lead to nearly identical estimates for both 7^ and 7^ and their confidence 
intervals, thus implying that this is a rather robust property of the system. 

Apart for the total energy, it turns out that analogous power law dependence with respect 
to Te is detected in the considered model for several dynamical and physical observables, 
such as Lyapunov dimension, maximal Lyapunov exponent and average zonal wind j^] . This 
suggests that the whole attractor of the model (more precisely, its SRB measure) has some 
scaling laws with respect to Te- The qualitative features described above for Te sufficiently 
large, such as the form of (//, a) as functions of Te and the fact that £ seems to approach a 
constant negative value, are most probably related to this scaling behavior. An important 
question we address elsewhere is whether this is a peculiarity of the baroclinic model used 
here or if analogous smoothness properties are common (generic or robust in some sense) for 
models of atmospheric dynamics, including General Circulation Models. 



IV. SENSITIVITY OF THE GEV INFERENCES 



The length of 1000 for the sequences of yearly maxima turns out to yield good accuracy 
for the GEV inferences. The sensitivity of such results has been tested by relaxing the 
experimental conditions considered in the previous section. This has been done in several 
ways: 

• by varying the number of extreme events (length of the sequences of yearly maxima); 

• by using soft extremes (maxima are computed over data blocks corresponding to time 
spans shorter than one year). 

• by varying the resolution of the model. 

The best estimates and related uncertainties of the GEV parameters obtained under modified 
experimental conditions have been first compared at phase value to what obtained in the 
reference case, in order to detect mismatch due to biases and changing precision. Moreover, 
the resulting differences in the GEV distributions have been inspected also with by adopting 
the standard graphical diagnostics, such as quantile-quantile and return level plots, and by 
computing bootstrap confidence intervals and profile likelihood both for the critical parameter 
£ and for the return levels. 

A. Sensitivity with Respect to the Extreme Events Sample Size 

We describe what is found when reducing the number of yearly maxima used for GEV 
inference. The, particularly unfortunate, case occurring for Te = 32 is first analyzed by 
means of profile likelihood for the GEV parameter £. Sequences of 1000, 300, 100, and 50 
yearly maxima of the total energy are used to produce the plots in Fig. The cases 1000, 
300, and 100 yield coherent estimates for £. For detailed diagnostics, confidence intervals are 
computed by the observed information matrix (formula (J7J)) and compared by those obtained 
by profile likelihood and by a standard bootstrap procedure. The three methods yield similar 
results in all cases, both for the estimates and for the confidence intervals. However, for 
50 maxima the confidence intervals become very wide and a positive value for £ is inferred, 
which is unphysical. 

The decay of the inference quality is revealed in a different way by the profile likelihood 
plots for the 100-year return levels (Fig.lllJI. It is, in general, not quite safe to infer, from a se- 
ries of n annual extremes, return levels with return periods larger than n years. Extrapolation 



to larger return periods may produce incorrect values and is likely to yield significant uncer- 
tainties. For the considered case, the estimates are coherent for 1000, 300, and 100 yearly 
maxima. As expected, the confidence intervals (computed by the 5 method, see Sec. I11BJ) 
become larger as shorter sequences of maxima are used. This also holds for bootstrap and 
profile likelihood. However, for 50 maxima the profile likelihood confidence intervals become 
very skewed as opposed to bootstrap or S method. This clearly indicates poor approximation 
of normality for the GEV estimators revealing the intrinsic unreliability of the estimates. 

Quantile-quantile and return level plots for the above inferences are reported in Fig. ED 
These confirm excellent quality for 1000, 300, and 100 maxima, whereas they reveal that 
something must be wrong for 50. In the quantile-quantile plots (top row of Fig. I12|) . from 
left to right there appear increasing departures from the diagonal in the tails, especially the 
upper tail, whereas the central part of the distribution does not suffer from sample reduction, 
except in the case of 50 maxima. Analogous effects occur in the upper tail of the return level 
plots. The main point here is that the most delicate part of an extreme value inference is 
the behavior of the tails. Usually, this is also the aspect one is most interested in. Notice 
how the black line in the middle of the return level plot for 50 maxima erroneously suggests 
unboundedness of the return levels (which is only possible for £ > 0, see Sec. Ill B|) . Therefore, 
extrapolations to high levels should be avoided in this case. 

We emphasize that the value of Te just examined corresponds to a particularly bad in- 
ference for 50 years. An overview, throughout the considered range of Te, of GEV inference 
sensitivity to length reduction is summarized in Fig. El where the cases of 300, 100, 50 yearly 
maxima are plotted against 1000. The quality of the fits, of course, generally decreases when 
using shorter series of maxima. Inference of £ is particularly sensitive to the length of the 
series of maxima: the maximal value of the ratios between uncertainty in £ and value of the 
corresponding maximum likelihood estimate of £ is 600%, 1387%, 45%, and 21%, for 50, 100, 
300, and 1000 maxima, respectively. The median of those ratios is 48%, 30%, 19%, 10%, 
respectively. Taking only 50 maxima yields two positive estimates of £ (for Te = 32 and 
50), which is an unphysical result, and overall very large uncertainties: for many values of 
Te, confidence bands for £ include part of the positive axis. The bias in the estimates of £ 
induces a significant alteration in those of a, although the inferred values of fi remain quite 
stable. Also notice the big uncertainties in the return levels for the two cases Te = 32 and 
50, corresponding to positive estimates of £. 

If we further consider additional sources of difficulty present in nature (for example the an- 
nual seasonal modulation), skepticism with respect to several inferences on extremes proposed 



in meteo-climatic literature seems justified. 



B. Sensitivity with Respect to the Extreme Events Selection Procedure: Soft Ex- 
tremes 

We now turn to the secon d ty pe of inference sensitivity mentioned above, obtained by 



using so-called soft extremes |26j instead of genuine extremes. In the present setting, we 
simulate the usage of soft extremes by considering sequences of maxima over data blocks 
that correspond to time spans shorter than one year, in particular 0.6, 1.2, and 3 months. 
In the first two cases, and especially in the first, we are not even sure that the considered 
maxima are effectively uncorrelated, which is the typical situation in real systems. In each 
case, the number of considered extremes is kept fixed to 1000, so that the difference is only 
determined by block length. 

The net result of using shorter time spans is the introduction of a progressively larger bias 
in the GEV inferences. The location and shape parameters are systematically underestimated. 
For the location parameter [i (leftmost column in Fig. 111}) the underestimation increases when 
taking maxima over shorter time-spans, but it also increases with Te- Notice that this is quite 
different from the effect of reduction of the number of maxima, compare Fig. (leftmost 
column). The sample medians of the relative differences between the estimates of /i for 12 
months and those for 3, 1.2, and 0.6 months (where the sample is indexed by the values 
of Te for which the estimates are computed) are 3.2%, 5.7%, and 7.5% for 3, 1.2, and 0.6 
months, respectively. Due to the definition of z p , see (JHI), the underestimation of the 100-year 
return levels is a consequence of that of fi. Also notice that the variations in the return 
levels connected to increase in Te are much larger than those induced by usage of either soft 
extremes or shorter data sets, also compare with Fig. EH (rightmost column). Conversely, the 
scale parameter a (second column from left in Fig. ITl)) is largely overestimated: the sample 
median of the relative differences between the estimates of a are 31%, 59%, and 82% for 3, 
1.2, and 0.6 months, respectively. So in our case, taking soft extremes mistakingly suggests 
an enhanced variability in the extreme values. 

Qualitatively, the response of the GEV estimates to the usage of soft extremes is explained 
by the introduction of much more data in the central part and in the lower tail of the 
distribution of the selected extreme values. From this fact, the underestimation of fi follows 
directly. Moreover, since the range of the extreme events distribution gets wider, a larger 
variability is artificially introduced and this is indicated by an overestimated scale parameter 



a. Lastly, the upper tail of the so obtained distribution of extremes looks more squeezed, 
given the wider extension at lower values. This corresponds to a more negative value of £, 
compare the third column from left in Fig. ITU 

C. Sensitivity with Respect to the Model Resolution 

In this section we analyze the response of the GEV inferences to variations in the model. 
In fact, this question is a further aspect of the smoothness and robustness issues discussed 
in Sec. 1111 Al which is of great practical importance: we would not like our estimates to 
drastically change if the model is slightly altered. Different choices are possible, such as 
introducing an orography in the bottom layer or changing the lateral boundary conditions. 
In the present setting, however, we confine ourselves to compare simulations of the baroclinic 
model computed at different resolutions (i.e., with different spectral discretization order JT, 
see flEH - flEU )- 

In particular, time series of the total energy, of length 1000 years, are computed with the 
baroclinic model using four different resolutions: JT = 8, 16, 32, 64 (resolution JT = 32 is 
used throughout the rest of this paper). In each case the GEV parameters are estimated 
from sequences of 1000 yearly maxima. The results are compared with each other in Fig. ITBI 
The relative differences of the estimated values of /i between the case JT = 32 and each of 
the other three cases (panel (A)) remain rather small: they are less than 1.5% for JT =16 
and 64 and grow up to about 4% for JT = 8. Also the estimates of £ in general agree quite 
well for all the considered resolutions. 

More pronounced differences appear in the inferred values of the scale parameter a: for 
Te > 26, the estimates obtained with resolutions JT = 8 and 64 are larger than those for 
JT = 16 and 32. The estimates for \i closely reflect the behavior of the time-averaged the 
total energy (computed on the same time series from which the yearly maxima are extracted). 
Considering, to fix ideas, the range Te € [26,36], for each fixed Te both the inferred values 
of /j, and the time-averaged total energy (not shown) decrease as JT increases. Conversely, 
there is no simple relation between the sample standard deviation oe of the total energy 
time series and the GEV scale parameter a: for the mentioned values of Te, the sample 
standard deviation <je decreases for larger JT (not shown), whereas this is not so for the 
scale parameter, see above. 

Power law fits of \x and a as functions of Te are performed for 1000 yearly maxima of the 
total energy, where the baroclinic model is run with four different resolutions: JT = 8, 16, 



32, 64. As in Sec. IIII Al the range of Te is divided into two intervals for the fits of [i and into 
three for a (in the latter case, no power law is found in the leftmost interval). Remarkable 
accuracy and coherence of the laws for /i is observed. There is more variability in the power 
laws for a, although again a striking coherence is observed for large T#. 

Summarizing, we have observed no dramatic model sensitivity for the GEV estimates. 
However, it is to be emphasized that a particularly stable observable has been examined 
here (the total energy) and only one type of model alteration has been considered, namely a 
change in the spectral resolution. We believe, though, our results are quite "generic" for the 
class of models considered in this paper. 



V. SUMMARY AND CONCLUSIONS 



In this paper we have performed statistical inference of extreme values on time series 
obtained by means of a dynamically minimal two-level quasi-geostrophic model of the atmo- 
sphere at mid-latitudes. The physical observable used to generate the time series is the total 
energy of the system and the statistical model for the extremes is the Generalized Extreme 
Value distribution (GEV). Several physically realistic values of the parameter Te, descriptive 
of the forced equator-to-pole temperature gradient and responsible for setting the average 
baroclinicity in the atmospheric model, are examined. In the standard setting, the maxima 
of the total energy are computed over one year long data blocks, and 1000 maxima are used 
as basis for the inference. 

A result of the present investigation, having potential relevance in atmospheric dynamics, 
is the detection of a piecewise smooth dependence of the location and scale GEV parameters 
(/i, a) on the model parameter Te controlling average baroclinicity. Two distinct power-laws, 
holding in different intervals of Te, are obtained both for /i and for a as functions of Tk, where 
the fit for /i is quite accurate. This regularity is put in relation with the results in |35[, where 
analogous scaling laws are found for other dynamical indicators, such as Lyapunov exponents 
and dimension, and physical observables, such the time-space average of total energy and 
zonal wind. The shape parameter £ also increases with Te but is always negative, as a priori 
required by the boundedness of the total energy of the system. We conjecture that also the 
dependence of £ on Te becomes smooth when much longer time series are considered. All 
these problems wiil be further explored in connected work. 

After the assessment of the goodness-of-fit by means of standard statistical diagnostics, 
such as return level and quantile-quantile plots and computation of confidence intervals by 
different procedures, we have consistently verified that: 

• the selected block length of one year guarantees that the extremes are uncorrelated 
and genuinely extreme; guaranteeing this property may result more problematic when 
dealing with real observations because of seasonal modulations, etc.; 

• the considered length of the series of maxima (1000 data) yields reliable parameter 
estimates; 

• the GEV inferences are not dramatically affected by structural changes in the atmo- 
spheric model adopted in the present work. 

The sensitivity of the statistical inference process is first studied with respect to the selection 



procedure of the maxima: we analyze the effects of reducing either the length of maxima 
sequences or the length of data blocks over which the maxima are computed. 

The first point is checked by repeating the GEV inferences with sequences of maxima 
having lengths 300, 100, and 50 years. The estimates are coherent for 1000, 300, and 100 
yearly maxima, but the confidence intervals of the best estimates, not surprisingly, widen 
up as shorter sequences of maxima are used. Moreover, markedly unreliable estimates are 
obtained when only 50 yearly maxima are considered: the estimated long-term return levels 
are patently wrong, the uncertainty of the inferred shape parameter £ is very large, and the 
best estimate of £ is positive (that is, unrealistic) for a few values of Te- 

In order to address the second point, we have taken maxima over data blocks corresponding 
to shorter time spans, to explore the effects of using soft extremes Specifically, the 

sensitivity of the GEV inferences is analyzed with respect to shortening the length of the data 
blocks to 3, 1.2, and 0.6 months. The obtained statistics is "polluted": a bias is introduced 
which is unacceptable for the cases of 1.2 and 0.6 months and still significant (at least for the 
GEV parameter a) for 3 months. Moreover, the parameter £ tends to be underestimated. 
Taking shorter maxima sequences results in even larger uncertainties, very large for the case 
of 50 yearly maxima. Physically unrealistic values of £ may also be obtained. 

Lastly, issues related to model sensitivity are also explored by varying the (spectral) reso- 
lution of the system. It turns out that the GEV estimates are in general rather robust under 
this sort of perturbation. Summarizing, to get a good inference many maxima are required 
and they must be genuinely extreme, that is, taken over sufficiently large data blocks. Failing 
to fulfill these requirements may result in affecting the GEV inferences much more critically 
than adopting a baroclinic model with lower resolutions. 

We conclude by highlighting that the parameterization of physical observables with respect 
to an external forcing is indeed a rather general and difficult problem in the dynamical analysis 
of the physical system. Existence of a unique Sinai-Ruelle-Bowen measure is required to 
rigorously associate a stationary stochastic process to the dynamical evolution law. However, 
even if an SRB measure exists and is unique, typically there is no explicit expression in terms 
of the system's equations and parameters |7[ . 

In this respect, the simplicity and the universality of the GEV model can be exploited 
to characterize chaotic systems by focusing on extreme values of suitable time series, rather 
then examining the distribution of all states visited by the system in phase space. Different 
model variants (both in boundary conditions and in model structure) and other observables 
and will be considered in future research. 
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APPENDIX A: CLASSICAL THEORY OF EXTREME VALUE DISTRIBUTIONS 



Let Xi, . . . , X n be a sequence of independent and identically distributed random variables 
(i.i.d.r.v.) where Fx is the common distribution function (d.f.). The classical theory of 
Extreme Values deals with the statistical behavior of the random variables 

M n = max{X u ...,X n }, (Al) 

which is the maximum of the first n variables (an analogous theory for the minima is de- 
veloped similarly, since min{Ai, . . . , X n } = — max{Xi, . . . , X n }). Under the assumptions of 
statistical independence and distributional equality of the X{, we known that 

P{M n <x} = P{X 1 <x,...,X n <x} = P{X 1 < x} ■ . . . ■ P{X n <x} = F x . (A2) 

However, in most practical applications this property is useless because, typically Fx is 
unknown. Moreover, the limit of F x is degenerate, since it is concentrated on the point 
x + = sup{x : F(x) < 1}: 

x < x+, 

lim = F x (x) = { (A3) 

1 X > x + . 



n— >+oo 



This difficulty is avoided by assuming the existence of two sequences of constants, {o~ n > 0} 
and {/i n } 5 such that M* = Mn ~ tXn ; rather than M n , has a nondegenerate limit distribution 
G(x): 

P{M* <x}^ G(x) (A4) 

for each continuity point x of G. 

Theorem A.l (Extremal Types Theorem). If there exist sequences of constant {a n > 0} 
and {/i n } such that the limit in \A$ exists, then the d.f. G(x) belongs to one of the following 
three parametric forms, called Extreme Value Distributions: 

x — fj, y 



Type I ( Gumbel): G\{x) = exp < — exp 



a 



-oo < x < +oo (A5) 



X ^ /i 

Type II (Frechet): G 2 (x) = { (A6) 

exp{-(^)- 5 } x>fi, e>0 

fexpj- [- (^)l 5 ) x</x, e<0 
Type III (Weibull): G 3 (x) = 1 I L V . /J J r (A?) 

I 1 X > /A 

with scale parameter o > 0, location parameter fi G M and, for the types II and III, the shape 
parameter £ ^ (for type I it is assumed £ = 0). 



This result was first proved by Fisher and Tippett jjj and then it was extended by Gne- 
denko [llj]. The strength of this theorem is the fact that it is a universal property, since it 
holds regardless of the parent distribution F X - Notice that: 

• Theorem IA. II does not guarantee the convergence in distribution for the variable M*. 
In fact it assumes it (compare (jA4|) ) . There are distributions for which the convergence 
requested in (|A4|) does not hold, see J3]. 

• The value of x + is finite only for the Weibull distribution, whereas the Frechet and 
Gumbel densities decay poly normally and exponentially as x — > +00, respectively. 

• The domain of attraction D(Gi), where G\ is one of (|A5[) - ([A7[) . is defined as the set 
of all d.f. F(x) such that one has convergence to Gi(x) in (jA4|) . Various criteria give 
necessary and sufficient conditions to determine what is the domain of attraction of 
each of the extremal distributions (jA5|) - (jA7|) . However, the limit type for a given F(x) 
only depends on the upper tail of F(x). See j^J. 

• The hypotheses of Theorem IA.1I can be relaxed to the case of stationary stochastic 
processes with weak long-range dependence (at extreme levels) 0, Chap. 5], which is 
of particular importance in our case. 

The Gumbel, Frechet and Weibull families are unified into the single GEV family of distribu- 
tion functions, given in (JTJ. So in the sequel we denote by G(x) the family in (JT|). For positive 
values of the shape parameter £, the Frechet family is obtained from and, similarly, for 
negative values we have Weibull. The Gumbel distribution is the limit for £ — > of G(x): 



lim G(x) = exp <^ — exp 



x — 11 



a 



(A8) 



This highlights another strength of the GEV model in concrete applications: the limit type 
is inferred from the data by estimating the parameter £. This removes the necessity of an 
initial and arbitrary choice of the limit type when using models (IA5I) - (|A7|) . 



APPENDIX B: A MODEL FOR THE MID-LATITUDES ATMOSPHERIC CIR- 
CULATION 



As mentioned in the introduction, the stochastic generator of the energy time series used 
in this paper is a model for the baroclinic jet at mid-latitudes. The system is relaxed towards 
a prescribed north-south temperature profile, where the gradient is controlled by a parameter 
Tg. In fact, the parameter Te controls average baroclinicity of the system and is used to study 
the relation with extreme values of the energy time series. Many dynamical properties of the 
model depending on Te have been analyzed before the analysis of extreme values presented 
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461 ] . to which we also refer for a 



in this paper, providing a sort of road map. See 
detailed derivation of the model and for discussion on the physics involved. In this section, 
we confine ourselves to a brief sketch. 

Starting point for the construction of the model is the two-level quasi-geostrophic equation: 

l A - r " Jf Jr + J ( r > ^ +(3y+ Wi*) + J (*> A ^ = 

^A H (<P-r)-^A H T + 2 -^(T-r*), (Bl) 



^A H <P + J (</>, A H <P + /3y) + J (r, A H r) = ~^A H (0 - r) . (B2) 

Here r and </> are the baroclinic and barotropic components, respectively, of the streamfunc- 
tion ipi and ^3 at the two levels: 

r = ^i-^ 3 ), 0=i(^i+^ 3 ), (B3) 

Ah is the horizontal Laplacian, 1 / H\ is the Froude number, (3 is the gradient of the Coriolis 
parameter, ue, k and z/jy parameterize the Ekman pumping at the lower surface, the heat 
diffusion, and the Newtonian cooling, respectively. 

The system is driven for the baroclinic component by the term in (r — r*) in ()B1J) . which 
forces a relaxation to the radiative equilibrium r* with a characteristic time scale of 1/vn- 
We take 

so that Te is the forced temperature difference between the low and the high latitude border 
of the domain. In this sense, the parameter Te is responsible for average baroclinicity of the 
system and is the control parameter we vary to test changes in the extreme value statistics. 

The fields <p and r are expanded in Fourier series in the longitudinal direction x. Moreover, 
in order to avoid wave-wave nonlinear interactions, only the terms of order n — 1 and n = 6 



are retained (see [35[ for details). This yields 

ry 

4> (x,y,t) = — / U (z, t) dz + Aexp (ix x ) + c - c - (B5) 
ry 

r(x,y,t) = — I m {z, t) dz + B exp (ixx) + c.c. (B6) 
./o 



By substitution into (jBl|) - (|B2|) . one obtains 

A yy - X 2 A + ( i x U + ) A ra - ( i x 3 U + i X ?7 ro + ~-^X 



Byy - X 2 B-— 2 B + [i X U + =-^ + ^)B y y 



I?) Ayy - (i X *U + i X tf w + ^X 2 
/ 2u E \ / n 2z/g 2 \ 

r xm ~ #f J ra ~ yx m + l x m yy - -JjjX 1-8 = 0, 

2 • / r 2z/ s 2k 



(B7) 



(i X *U + i X l/ w + |f X 2 - ix/? + ^X 2 + |f + J^tf) 5 (B8) 



\ ^ /. 3 2z/g 2 



■ ( ' l X m - -JjwJ A yy ~ I *X m + ixm yy - -j^x ~ Jq^m ) A = 0, 



2z/ £ 



^ + -E7r(^ - m ) + 2xIm(AA* + BB*) = 0, (B9) 



H. 



2 



2 . 2k 2ve , tt 2vn 



u yy u2 1 tt2 yy tt2 v- ■■-/m rri 
n 2 N 2 N 2 ti 2 



(U - m) yy - -^(m - m* 



+ -±- 2 x\m{A*B) yy + 2 X Im(AB* y + BA* y ) ym = 0, 

U 2 



(BIO) 



where the dot indicates time differentiation and A* denotes the complex conjugate of A. This 
is a set of 6 equations for the real fields A 1 , A 2 , B 1 , B 2 , U, m, where A 1 and A 2 are the 
real and imaginary parts of A and similarly for B. Rigid walls are taken as boundaries at 
y = 0, L y , so that all fields have vanishing boundary conditions. 

A system of ordinary differential equations is obtained from (|B7Jl - (|B10J) by means of a 
pseudospectral (collocation) projection, involving a Fourier half-sine expansion of the fields 
of the form 



JT 



JT 



^ = ^A}sin(^J, z = l,2, (Bli; 
\ y / 



3=1 



^ = 5>5sin(^), i = l,2, (B12) 



L 

i 

JT 



^ = £^sin(^V (B13) 

j=i \ f / 

m = mj sin ( J . (B14) 

The resulting system is the generator of the time series used in this paper for extreme value 
analysis. In particular, as observable (that is, as function of the state space yielding the 



time series) we choose the total energy E(t) of the system, obtained by integration in the 
(x, y)-domain of the energy density: 



e{x,y,t) = — 
9 



2 (V^3 



+ 



2Hi 



(B15) 



Here the factor Sp/g is the mass per unit surface in each level, the first two terms inside the 
brackets describe the kinetic energy and the last term describes the potential energy. We 
emphasize that in the expression (|B15|) the potential energy term is half of what reported 
in ^jj, which contains a trivial algebraic mistake . 

It turns out that the order JT = 32 in the expansion (jBll|) - (jB14|) is sufficiently high 
to have an earth-like chaotic regime characterized by intermediate dimensionality in suitable 
ranges of the parameter T#. By chaotic, we mean that the dynamics takes place on a strange 
attractor with internally generated noise. By earth-like we mean that the time-dependent 
Fourier coefficients in (|Bll)l - (|B14jl . as well as the total energy and mean zonal wind, have 
unimodal probability densities. The mentioned chaotic range is Tg > T^" lt , where T^ lt = 8.75 
approximately. For lower values of Te, the Hadley equilibrium (stationary solution) is stable 



36l . [iq l for a complete discussion. 



and is therefore the unique attractor. Again see 
Throughout this work, we consider JT = 8 , 16, 32, and 64 and the considered parameter 
range is 10 < T# < 50 with integer steps of 2. 
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7i 


1 E 


72 


1000 


1.7310 ±0.0007 


18 


1.6019 ±0.0017 


300 


1.7311 ±0.0013 


18 


1.6017 ±0.0030 


100 


1.7219 ±0.0018 


17 


1.5982 ±0.0018 



TABLE I: Power law fits of the location parameter \i as a function of Te of the form oc Tg, 
performed in two adjacent intervals of Te- The number of used annual extremes is n and T\ is the 
value of Te separating the two intervals. Compare with figure Fig. |SJ 
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L E 


7i 


If 


72 


1000 


15 


3.011 ±0.076 


22 


2.140 ±0.025 


300 


14 


3.236 ±0.115 


22 


2.114 ±0.047 


100 


14 


2.944 ±0.157 


24 


2.040 ± 0.093 



TABLE II: Same as Tab. [I] for the scale parameter a. Here the fits a = Te 1 and a = Te 2 hold 
for T E such that Tf <T E < Tf and Tf <T E < 50, respectively. No power law fit is found for 
T E < T b E l . Compare with Fig.H 



JT 
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rpb 

1 E 


72 


64 


1.7346 ± 0.0008 


15 


1.6027 ±0.0005 


32 


1.7310 ±0.0007 


18 


1.6019 ± 0.0005 


16 


1.7027 ± 0.0007 


18 


1.5982 ± 0.0007 
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1.6794 ± 0.0006 


22 


1.5977 ±0.0011 



TABLE III: Power law fits of the location parameter \x as a function of Te of the form fx oc T~l. JT 
indicates the spectral resolution (number of Fourier modes) of the baroclinic model and T E is the 
value of Te dividing the two considered intervals, see text for details. 



JT 
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7i 


T*2 
1 E 


72 


64 


18 


2.514 ±0.046 


32 


2.067 ±0.055 


32 


15 


3.011 ±0.076 


22 


2.140 ±0.025 


16 


15 


2.821 ± 0.045 


26 


2.150 ±0.033 
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17 


2.675 ± 0.065 


26 


2.149 ±0.033 



TABLE IV: Same as Tab. II 111 for the scale parameter a oc 1%. The interval [Tf,Tf] is the ranee 
of validity of the the first power-law, having exponent 71. The point dividing the two considered 
intervals is T| 2 . No power law is detected for Te < Tf. 
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FIG. 1: Autocorrelations of the total energy time series for Te = 10,30,50 (left, center, right, 
respectively), time-lag in days on the horizontal axis. The full 6-hourly time-series of 1000 years 
have been used, see Sec. Ill Al 
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FIG. 2: Histograms and boxplots of the total energy time series for Te = 10,30,50 (left, center, 
right, respectively). 




FIG. 3: Time-averaged total energy (vertical axis) as a function of Te (horizontally), for each of 
the 21 selected values of Te- Confidence bands (average plus or minus a 1.96 times sample standard 
deviation) are added. The full 6-hourly time-series of 1000 years have been used, see Sec. Ill Al 
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FIG. 4: Autocorrelations of the sequences of 1000 yearly maxima of the total energy time series 
for Te = 10, 30, 50 (from left to right, respectively). 
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FIG. 5: From left to right: maximum likelihood estimates of [i, a, and £, respectively (vertical 
axis), for each of the 21 sequences of 1000 maxima of the total energy, against the corresponding 
values of Te (horizontal axis). Confidence intervals are added with errorbars but are hardly visible 
for n (leftmost panel) at the selected scale. 




FIG. 6: Maximum likelihood estimates of the 10-, 100-, and 1000-year return levels of the total 
energy (red, green, and blue, respectively) for each of the 21 stationary series of 1000 maxima of 
the total energy, against the corresponding values of Te (horizontal axis). Confidence intervals are 
added with errorbars but are hardly visible (at the selected scale). 
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FIG. 7: Left: Probability density functions of the GEV for the 21 values of Te in the considered 
range [10,50]. Right: same as left for 8.75 < T E < 9.75. 




FIG. 8: Power law fits of the inferred values of log(/z) (vertical axis) as a function of log(Tg) 
(horizontal axis). From left to right: 1000, 300, and 100 yearly maxima have been used. In each 
case, there are two intervals of Te, separated by a point T E , characterized by a different scaling 
exponent, compare Tab. [I] 




FIG. 9: Same as Fig. El for log(cr), see Tab. HQ 




FIG. 10: Profile likelihood plots of f for T E = 32. From left to right, 1000, 300, 100, and 50 yearly 
maxima have been used, respectively. In the latter case, the estimate of £ is positive. Confidence 
intervals are computed by bootstrap, profile likelihood, and observed information matrix (the three 
stacked lines at the bottom part of the plots, from top to bottom, respectively). Notice the increasing 
width of the confidence intervals (quite large already for length 100) and the agreement between 
confidence intervals computed by the three methods, also for the wrong estimate obtained with 50 
maxima. 
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FIG. 11: Profile likelihood plots of the 100-year return level for Te = 32 for different lengths of 
the sample of yearly maxima: 1000, 300, 100, and 50 yearly maxima have been used from left to 
right, respectively. Confidence intervals are computed by bootstrap, profile likelihood, and observed 
information matrix (the three stacked lines at the bottom part of the plots, from top to bottom, 
respectively). Notice the increasing width of the confidence intervals and increasing skewness of 
those obtained by profile likelihood. 
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FIG. 12: Diagnostic plots of the GEV inferences for Te = 32. Top and bottom row: quantile- 
quantile and return level plots, respectively (see Sec. lIIBl for definitions). From left to right column: 
sequences of yearly maxima of the total energy are used, having lengths 1000, 300, 100, and 50, 
respectively. Notice the different scale of the vertical axis in the rightmost return level plot. 
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FIG. 13: Top row: maximum likelihood estimates of fx, a, and £ (from left to right, respectively) 
for 1000 and 300 yearly maxima (green and red respectively), with confidence intervals computed 
by the observed information matrix (JJJ). Center, bottom row: same as top, for 100 and 50 yearly 
maxima, respectively, instead of 300. In the case of 50 maxima, for T# = 32 and 50 the inferred 
values of £ are positive (thus completely wrong according to the theoretical expectation, see text) 
and the uncertainties are very large for a and £. 




FIG. 14: Inferred values of GEV parameters as a function of Tg (horizontal axis) in a soft extremes 
experiment: from top to bottom row, sequences of 1000 maxima of the total energy time series are 
used, where the maxima are determined over data blocks corresponding to 3, 1.2, and 0.6 months. 
From left to right column, fi, a, £, and 100-year return levels are plotted. In green the estimates 
obtained for the yearly maxima (as in Fig. [5J) are displayed for reference. Notice how the magnitude 
of the uncertainties shows little dependence on the temporal block length. 
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FIG. 15: Left: relative difference (/j,jt — ^64,) /a*64 of the maximum likelihood estimates of the GEV 
parameter fx (vertical axis) for resolutions JT = 8, 16, 32 (red, green, and blue, respectively) with 
respect to the reference case JT = 64. Middle, right: estimates of a and £, respectively (vertical 
axis), for the cases JT = 8, 16,32,64 (red, green, blue, magenta, respectively), where sequences of 
1000 maxima are used. On the horizontal axis, the value of Te is given for which the simulations 
are performed. 



